
[FIX] Fitter: Fix infinite recursion in __getattr__ #1977

Merged
5 commits merged from fix-fitter-recursion into biolab:master on Feb 23, 2017

Conversation

pavlin-policar
Collaborator

Issue

Running fitters on large enough datasets through the Test & Score widget with cross validation selected would result in a stack-overflow error.

I am not particularly sure why this happened. My theory is that when Test & Score receives a large enough dataset, it runs cross validation in parallel, and the learner attributes were probably being accessed statically and failing.

Unfortunately, I have no idea what caused this or how exactly it was triggered, so I don't really know how to write a test against it. But since I will most likely change this code in the near future, I wouldn't spend too much time on it.

Description of changes

No longer rely on self.kwargs being set on the object; instead, fetch kwargs explicitly with __getattribute__, which raises a plain AttributeError when it is missing.
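
A minimal sketch of the idea (an illustrative class, not the actual Orange code):

class Fitter:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

    def __getattr__(self, item):
        # Writing self.kwargs here would re-enter __getattr__ whenever
        # 'kwargs' itself is missing (e.g. on an instance that pickle
        # rebuilds without calling __init__), recursing forever.
        # __getattribute__ raises a plain AttributeError instead.
        kwargs = self.__getattribute__('kwargs')
        try:
            return kwargs[item]
        except KeyError:
            raise AttributeError(item)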

Includes
  • Code changes
  • Tests
  • Documentation

@codecov-io

codecov-io commented Jan 29, 2017

Codecov Report

Merging #1977 into master will increase coverage by 19.21%.
The diff coverage is 81.81%.

@@             Coverage Diff             @@
##           master    #1977       +/-   ##
===========================================
+ Coverage   70.22%   89.43%   +19.21%     
===========================================
  Files         343       90      -253     
  Lines       54092     9190    -44902     
===========================================
- Hits        37984     8219    -29765     
+ Misses      16108      971    -15137

Last update 6ca41f6...49188ee.

@kernc
Contributor

kernc commented Jan 29, 2017

  1. joblib tries to pickle its arguments.
  2. Pickling accesses obj.__getstate__ for "instructions".
  3. The __getattr__ handler calls self.get_learner(None), because self.problem_type is not yet set (we haven't fitted anything yet).
  4. This raises AttributeError in get_learner().
  5. Python takes this to mean that obj doesn't have the attribute __getattr__ was called with, so it calls __getattr__ with it again.
    ...

I guess the comment in get_learner() is wrong and a TypeError should be raised instead. With this change, test_evaluation_testing.TestCrossValidation.test_njobs() with the learner set to one of the fitters no longer crashes.
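
The cycle is easy to reproduce with a self-referential __getattr__ (a minimal stand-in, not the actual fitter code):

class Broken:
    def __getattr__(self, item):
        # 'kwargs' was never set on the instance, so this line re-enters
        # __getattr__('kwargs'), which reads self.kwargs again, and so on.
        return self.kwargs[item]

# Any failed lookup now recurses; pickling triggers one through its
# probes for special methods such as __getstate__.
Broken().problem_type  # RecursionError: maximum recursion depth exceeded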

@pavlin-policar pavlin-policar force-pushed the fix-fitter-recursion branch 2 times, most recently from 32963d2 to ac6761c on January 29, 2017 17:25
@pavlin-policar
Collaborator Author

pavlin-policar commented Jan 29, 2017

Wow, thank you very much. I don't think I would've figured that out by myself. That makes the fix much cleaner :)

@pavlin-policar
Collaborator Author

I noticed this was still failing in some cases, so I've tried to make things a bit more straightforward.

Note: This is based on #1968 because I made some changes to the learner widget tests that enable testing fitters, which turned out to be useful here as well.

I've made two rather big changes from last time:

  1. To make Learners simpler in general, we now assume that they will always have a params attribute (this is noted in the main learner docstring, but I don't really have a good way to enforce it). The only change this required was to the Tree learners, since everything else already had params. It might be nicer to have all the learner params as actual attributes on the learners, which could probably be done easily, but this approach required fewer changes. @kernc, what do you think about this? The API on learners is still kind of a mess, but this could be an improvement.
  2. I've removed all state from Fitters. Storing the problem type on the fitter was causing all kinds of unpredictable behaviour and was difficult to understand, so I've removed that entirely. @janezd what do you think about this? This should make the fitters simpler.

This also fixes the error that @lanzagar pointed out last week.

@janezd janezd (Contributor) left a comment


I'm a bit confused by some changes related to #1986. Could you rebase onto master now that #1986 is merged?

@@ -64,6 +64,8 @@ def __init__(
self.min_samples_split = min_samples_split
self.sufficient_majority = sufficient_majority
self.max_depth = max_depth
self.params = {k: v for k, v in vars().items()
if k not in ('args', 'kwargs')}
janezd (Contributor)

What is in vars() -- except globals, which we don't like anyway?! I don't like this blind copying.

pavlin-policar (Collaborator, Author)

This was an attempt to make the Learner API more intuitive. Currently, Orange learners have their settings (or parameters) saved as attributes on the object itself, whereas sklearn learners have a special params field that stores all that data. This field is used heavily in the tests and only rarely in any other code, which makes dealing with the tests a bit confusing. It would probably be better to make the field protected and then do some magic with __getattr__ to make those fields accessible to the tests. If this were done, the API of the learners would be a bit more straightforward, but if you think it's unnecessary, I can remove it altogether. That would probably be cleaner than the blind copying I've done here.
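
Roughly like this (an illustrative sketch of the idea, not actual Orange code):

class Learner:
    def __init__(self, max_depth=None, min_samples_split=2):
        self._params = {'max_depth': max_depth,
                        'min_samples_split': min_samples_split}

    def __getattr__(self, item):
        # Fetch _params via __getattribute__ so that a missing '_params'
        # raises AttributeError instead of re-entering __getattr__.
        params = self.__getattribute__('_params')
        try:
            return params[item]
        except KeyError:
            raise AttributeError(item)

Tests could then keep writing learner.max_depth while the stored parameters stay in one protected dict.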

janezd (Contributor)

I still don't understand what the content of vars() is. It contains binarize, max_depth, min_samples_leaf, min_samples_split and sufficient_majority; if these need to be copied to params, I'd copy them explicitly. Besides, it probably contains self, and beyond that all global names in the module, which you don't want to copy.

I am probably missing something here.

pavlin-policar (Collaborator, Author)

No, you're not missing anything at all; I simply tried to emulate what the constructors in the sklearn learners do, which is to use vars(). I'll change it so the params are copied explicitly. I suppose the same issue is then present in all the sklearn learners; I could fix that in a separate PR if you'd like, since this one is kind of a mess already.

janezd (Contributor)

You're right, it's uglier (although perhaps better).

Leave it like this for now, and perhaps later change it in all SKL learners, e.g. by adding something like this to SKLLearner:

from inspect import signature

...

def update_params(self, values):
    # Copy only the names that are declared parameters of __init__;
    # vars() also contains 'self', which must not end up in params.
    param_names = signature(type(self).__init__).parameters
    self.params.update({name: values[name]
                        for name in param_names
                        if name not in {"self", "args", "kwargs"}})

which would then be called from __init__ of derived classes with self.update_params(vars()). This would only copy the names that were given as arguments.
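
For instance, a derived learner might do (a hypothetical class, assuming SKLLearner initializes an empty params dict and defines update_params() as above):

class TreeLearner(SKLLearner):
    def __init__(self, max_depth=None, min_samples_split=2):
        super().__init__()
        # vars() is __init__'s local namespace: 'self' plus the declared
        # arguments; update_params() keeps only the arguments.
        self.update_params(vars())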

I see that there's already some similar magic in our SKL wrappers.

Not in this PR.

@pavlin-policar
Collaborator Author

Sorry this has taken me so long to get to, but I'm at it again.

I'm guessing you meant changes related to #1968. The only thing these two have in common is some shared code to make the tests work with fitters.

@janezd
Contributor

janezd commented Feb 17, 2017

I will try to merge all your PRs today, so they can be a part of the next release, which is due soon.

The first three commits of this PR are already in master, so if you just rebase onto master, they will disappear here and make the code easier to review.

This occurred when a large dataset was used in the test and score widget
using cross validation. Changing the exception from AttributeError to
TypeError fixed this problem.
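
A sketch of the change that commit describes (get_learner() is named in the discussion above, but this body and the learners mapping are assumptions):

def get_learner(self, problem_type):
    if problem_type not in self.learners:
        # An AttributeError raised from here, inside a __getattr__ call,
        # makes Python report the originally requested attribute as
        # missing; a TypeError propagates as the genuine error it is.
        raise TypeError("No learner to handle problem type "
                        "{!r}".format(problem_type))
    return self.learners[problem_type]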
@pavlin-policar
Collaborator Author

Yeah, I'm sorry, I had forgotten to rebase; it's done now.

It is necessary to specify which problem type any param on the learner
is valid for, and I had forgotten to do this for the SVM learner. This
fixes that.
@ajdapretnar
Contributor

This fixes the issue I had when training a classification tree on employee attrition data.

@janezd janezd merged commit 9b3faf4 into biolab:master Feb 23, 2017
@pavlin-policar pavlin-policar deleted the fix-fitter-recursion branch February 23, 2017 22:55